ARTICLE INFO

Keywords: Bats; Coronavirus; SARS

ABSTRACT

To help reveal the diversity and evolution of bat coronaviruses we
collected 1067 bats from 21 species in China. A total of 73
coronaviruses (32 alphacoronaviruses and 41 betacoronaviruses) were
identified in these bats, with an overall prevalence of 6.84%. All
newly-identified betacoronaviruses were SARS-related Rhinolophus bat
coronaviruses (SARSr-Rh-BatCoV). Importantly, with the exception of the
S gene, the genome sequences of the SARSr-Rh-BatCoVs sampled in Guizhou
province were closely related to SARS-related human coronavirus.
Additionally, the newly-identified alphacoronaviruses exhibited high
genetic diversity and some may represent novel species. Our phylogenetic
analyses also provided insights into the transmission of these viruses
among bat species, revealing a general clustering by geographic location
rather than by bat species. Inter-species transmission among bats from
the same genus was also commonplace in both the alphacoronaviruses and
betacoronaviruses. Overall, these data suggest that high contact rates
among specific bat species enable the acquisition and spread of
coronaviruses.


1. Introduction 


Coronaviruses (CoVs; family Coronaviridae) are enveloped,
positive-sense, single-stranded RNA viruses with the largest genomes
(25-31 kb) among known RNA viruses (de Groot et al., 2011). Based on
genome-scale phylogenies the known CoVs are classified into 30
species within four genera: Alphacoronavirus, Betacoronavirus,
Gammacoronavirus, and Deltacoronavirus (ICTV, 2017).
Coronaviruses can infect humans, other mammals, and birds, causing 
respiratory, enteric, hepatic, and neurological diseases of varying 
severity (Masters and Perlman, 2013). Coronaviruses are well known 
globally due to the emergence of severe acute respiratory syndrome 
(SARS) during 2002-2003, caused by a previously unknown CoV
(Ksiazek et al., 2003; Peiris et al., 2003). Subsequently, two other
human CoVs (NL63 and HKU1) causing respiratory disease were
identified (van der Hoek et al., 2004; Woo et al., 2005). Strikingly,
Middle East respiratory syndrome (MERS), which emerged in 2012
and is characterized by a higher mortality than SARS, was also caused
by a previously unknown CoV (Bermingham et al., 2012; Zaki et al., 2012).
The ongoing emergence of these CoVs in humans means that CoVs will 
likely remain a key public health threat for the foreseeable future. 
Since the discovery of SARS-CoV in Himalayan palm civets (Guan 
et al., 2003), intense effort has been directed toward identifying and 
characterizing coronaviruses from animals globally. Consequently, a 
number of CoVs have been identified in a diverse range of vertebrates 
including domestic and wild mammals, and birds (Poon et al., 2005;
Wang et al., 2015; Woo et al., 2012). Bats (order Chiroptera), with
more than 1240 species, have remarkable species diversity and
comprise more than 20% of living mammalian species (Nowak,
1994). The discovery of SARS-related CoVs in Rhinolophus horseshoe
bats in China in 2005 (Lau et al., 2005; Li et al., 2005) attracted
global attention to these mammals, such that diverse Alpha- and
BetaCoVs have now been identified in a variety of bats globally over
the past decade (Corman et al., 2013, 2014, 2015; Drexler et al., 2014;
He et al., 2014; Huang et al., 2016; Smith et al., 2016; Woo et al.,
2012). More
importantly, due to the close relationship between CoVs in bats and 
those causing human infections, it is believed that bats are the original 
source of human CoVs including SARS-CoV and MERS-CoV (Corman 
et al., 2015; Ge et al., 2013; Huynh et al., 2012; Ithete et al., 2013; Tao 
et al., 2017). Due to their high diversity and biological and ecological 
characteristics that potentially facilitate virus maintenance and trans- 
mission, bats likely harbor a large number of viruses, some of which 
may then jump to other species (Balboni et al., 2012). Hence, more 
effort is needed to identify and characterize the currently unrecognized 
CoVs that circulate in bats. 

At least 120 bat species are found in China, mainly distributed in 
the eastern, central, and southern regions of that country (Zhang et al., 
1997). Herein we report novel and diverse CoVs and SARS-related 
CoVs identified in Rhinolophus, Miniopterus, Murina and Myotis spp. 
bats sampled from several geographic regions of China. In addition, we 
inferred their genomic characteristics and evolutionary relationships 
with known viruses and their hosts. 


2. Results 
2.1. Bats collected and prevalence of CoVs 

During 2012-2015 a total of 1067 bats were collected from five 
caves in five counties of Guizhou, Henan, and Zhejiang provinces
(Fig. 1, Tables 1 and S1). After morphological examination and
sequence analysis of the mitochondrial cytochrome b (mt-cyt b) gene,
these bats were assigned to 21 species. The species and their
abundance varied among regions, with Miniopterus schreibersii
(47%) in Guizhou, Rhinolophus ferrumequinum (45%) and R. pusillus
(32%) in Henan, and R. monoceros (41%) and R. sinicus (50%) in
Zhejiang as the predominant species. Notably, only R. pusillus bats
were found at all five sampling sites.

Using RT-PCR targeting a conserved fragment of the RdRp (RNA- 
dependent RNA polymerase) gene of CoV as described previously (Lau 
et al., 2005; Wang et al., 2015), viral RNA was identified in a total of 73 
bat fecal samples, with an overall detection rate of 6.84% (Table 1). 
Phylogenetic analysis revealed that all of these viral sequences were
closely related to known coronaviruses. Among the predominant bat
species, CoV prevalence was high in R. monoceros (28/119, 23.53%)
and M. schreibersii (16/198, 8.08%), but was lower in R. sinicus
(8/219, 3.65%), R. ferrumequinum (2/183, 1.09%) and R. pusillus
(1/154, 0.65%).
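
As a quick check on the detection rates quoted above, the short Python sketch below recomputes each rate from the positive/tested counts given in the text and in Table 1. The function and variable names are illustrative only and are not part of the original analysis.

```python
# Positive and tested counts for the predominant species (from the text / Table 1).
counts = {
    "R. monoceros": (28, 119),
    "M. schreibersii": (16, 198),
    "R. sinicus": (8, 219),
    "R. ferrumequinum": (2, 183),
    "R. pusillus": (1, 154),
}


def prevalence_percent(positive, tested):
    """Detection rate as a percentage, rounded to two decimal places."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return round(100.0 * positive / tested, 2)


# Per-species rates and the overall rate (73 positives among 1067 bats).
rates = {species: prevalence_percent(p, n) for species, (p, n) in counts.items()}
overall = prevalence_percent(73, 1067)

for species, rate in rates.items():
    print(f"{species}: {rate:.2f}%")
print(f"Overall: {overall:.2f}%")
```

Keeping the raw counts next to the percentages makes it easy to re-derive any rate if the sampling table is updated.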

Of the 73 newly-identified CoVs, 32 were alphacoronaviruses
and 41 were betacoronaviruses. To better characterize these
newly-identified bat CoVs, the complete viral RdRp gene sequence was
obtained from 65 (89%) of the viral RNA-positive bat samples. In
addition, 5 complete and 4 near-complete viral genome sequences were
successfully obtained from CoV-positive samples.


2.2. Newly-identified SARS-related Rhinolophus bat coronaviruses in 
bats 


Genetic analysis of the conserved domains in the replicase
polyprotein pp1ab, namely ADP-ribose 1''-phosphatase (ADRP), nsp5
(3CLpro), nsp12 (RdRp), nsp13 (Hel), nsp14 (ExoN), nsp15 (NendoU) and
nsp16 (O-MT), revealed that the newly-identified bat betacoronaviruses
shared more than 90% amino acid sequence identity with Severe acute
respiratory syndrome-related coronavirus (SARSr-CoV) (Table S2)
and clustered together in a phylogenetic analysis (Fig. 2). Hence, these
data suggest that these bat viruses belong to the species SARSr-CoV
according to the criteria for species demarcation in the subfamily
Coronavirinae defined by the International Committee on Taxonomy of
Viruses (ICTV) (de Groot et al., 2011). As these viruses were from
Rhinolophus bats we designated them SARS-related
Rhinolophus bat coronaviruses (SARSr-Rh-BatCoV). Among these, 33
SARSr-Rh-BatCoVs were identified in 28 R. monoceros, 1 R. pearsonii,
3 R. sinicus and 1 R. thomasi sampled from the city of Longquan,
Zhejiang province. Similarly, 2 SARS-related CoVs were identified in R.
ferrumequinum sampled from the city of Jiyuan, Henan province,
while the remaining 6 viruses were identified in 5 R. sinicus and 1 R.
rex sampled from the county of Anlong, Guizhou province.

Fig. 1. Map of China showing the trap sites (red circles) at which bats were captured.

Table 1. Prevalence of coronaviruses in the bats collected during 2012-2015 in Guizhou, Henan and Zhejiang provinces, China.

Species                     Guizhou   Henan                         Zhejiang  Total
                            Anlong    Neixiang  Lushi     Jiyuan    Longquan
Aselliscus stoliczkanus     0/2       -         -         -         -         0/2
Barbastella beijingensis    -         -         -         0/2       -         0/2
Hipposideros armiger        1/35      -         -         -         -         1/35
Hypsugo savii               -         -         -         0/1       -         0/1
Miniopterus ricketti        -         -         0/1       -         -         0/1
Miniopterus schreibersii    14/179    2/19      -         -         -         16/198
Murina leucogaster          -         3/21      2/19      0/2       -         5/42
Murina sp.                  -         -         -         0/3       -         0/3
Myotis davidii              3/5       2/4       0/1       0/1       -         5/11
Myotis siligorensis         1/4       -         -         -         -         1/4
Plecotus auritus            -         -         -         0/4       -         0/4
Rhinolophus affinis         -         0/3       -         -         -         0/3
Rhinolophus ferrumequinum   -         0/8       0/18      2/157     -         2/183
Rhinolophus luctus          -         -         0/2       -         0/1       0/3
Rhinolophus macrotis        1/1       -         -         -         -         1/1
Rhinolophus pearsonii       2/30      0/7       -         -         1/21      3/58
Rhinolophus pusillus        1/22      0/45      0/57      0/29      0/1       1/154
Rhinolophus rex             1/23      -         -         -         -         1/23
Rhinolophus sinicus         5/83      -         -         -         3/136     8/219
Rhinolophus thomasi         -         -         -         -         1/1       1/1
Rhinolophus monoceros       0/1       0/4       0/2       -         28/112    28/119
Total                       29/385    7/111     2/100     2/199     33/272    73/1067

Note: "-" no animals were captured.
On the RdRp phylogeny (Fig. 2) all known SARSr-Rh-BatCoVs from
China could be divided into four clusters, within which the
newly-identified viruses fell into three clusters that reflect their
geographic origins. Specifically: (i) the viruses identified in
Rhinolophus bats sampled in Zhejiang province (denoted Rhinolophus bat
Longquan-) were closely related to each other and clustered with
Rhinolophus bat CoV HKU3 sampled from R. sinicus in Hong Kong (Lau et
al., 2005); (ii) the viruses identified in R. ferrumequinum from Jiyuan
in Henan province (Jiyuan-84 and Jiyuan-331) formed a cluster with those
viruses identified in R. ferrumequinum from other regions of China.
The viruses within this cluster were from central China (Henan, Hubei
and Shaanxi provinces), with the exception of the lineage comprising
JMC15 and BtRf-BetaCoV/JL2012 identified in R. ferrumequinum
sampled in northeastern China (Jilin province); (iii) the viruses
identified in R. sinicus sampled from Anlong in Guizhou province
clustered with those identified in Rhinolophus bats from southwestern
China including Guangxi, Guizhou and Yunnan provinces (Ge et al.,
2013; Li et al., 2005). Strikingly, only the bat SARSr-Rh-BatCoVs from
southwestern China exhibited a close evolutionary relationship with
SARS-related human coronavirus (SARS-CoV) and SARS-related palm
civet coronavirus (SARSr-CiCoV) (Tor2 and SZ3), suggesting that
SARS-CoV may have originated in this region. Finally, within each of
these three clusters the SARS-related coronaviruses clustered
according to their geographic origins.

Fig. 2. Phylogenetic analysis of the nucleotide sequences of the RdRp and S genes including those CoVs obtained here. Bootstrap values (> 70%) are shown at relevant nodes. The trees were mid-point rooted for clarity only. The scale bar depicts the number of nucleotide substitutions per site.

Unlike the RdRp gene tree, all SARSr-Rh-BatCoVs and SARS-CoVs
from China fell into two distant clades on the S gene phylogeny (Fig. 2).
The first included the SARS-CoVs and two bat SARSr-Rh-BatCoVs
(Rs3367 and RsSHC014), while the second comprised all the
remaining bat SARSr-Rh-BatCoVs including the newly-identified CoVs, which
could be further sub-divided into three clusters according to their
geographic and/or host origins. Finally, additional analysis revealed
that the S1 and S2 gene tree topologies differed from that of the S gene
as a whole (Fig. S1), suggestive of potential recombination events (see
below).


2.3. Characterization of the SARSr-Rh-BatCoV genome 


To better characterize these newly-identified SARSr-Rh-BatCoVs
we recovered the complete genome sequences from each of the lineages
described. Their genomes varied in size from 29,665 to 29,693
nucleotides and shared similar genome organizations with known
SARS-related CoVs (Fig. 3), including the putative transcription
regulatory sequence (TRS) motif, 5’-ACGAAC-3’, at the 3’ end of the
5’ leader sequence and preceding each ORF except ORF7b.
Additionally, a single long ORF8 was observed in all newly-identified
SARSr-Rh-BatCoVs.

The SARSr-Rh-BatCoVs from Longquan (Zhejiang) were closely
related to the HKU3 strain, with nucleotide identities ranging from
94.7% to 97.0% and amino acid identities of 98.2-99.1% in the RdRp.
Notably, however, in the nsp2 gene they differed by up to 23% at the
amino acid level. Additionally, the Longquan-140 virus was markedly
different from SARSr-CiCoV at the amino acid level in the nsp2
(25.3%), S (20.3%), ORF3 (17.5%), and ORF8 (62.9%) gene sequences.
The SARS-related viruses from Jiyuan (Henan) were closely related to
BtRf-BetaCoV/HeB2013 and BtRf-BetaCoV/SX2013, with 99.0-99.2%
nucleotide identities. Strikingly, the SARSr-Rh-BatCoVs sampled from
Anlong (Guizhou) were closely related to SARS-CoVs and SARSr-CiCoVs,
with > 94% nucleotide identities.


2.4. Recombination of SARSr-Rh-BatCoVs 


We next conducted an analysis of possible recombination events
using available genome sequences of SARSr-Rh-BatCoVs and other
betacoronaviruses, including those discovered in this study and
SARS-CoVs and SARSr-CiCoVs (Tor2 and SZ3). As noted in Table S2, the
Longquan-140 virus was markedly different from HKU3 in the nsp2
gene, suggestive of possible recombination. When the Longquan-140
sequence was used as the query for a sliding window analysis with HKU3
as a potential parent, two recombination breakpoints, located in the
nsp2 (nucleotide 1727) and nsp3 (nucleotide 3055) genes, were
detected in Longquan-140 with strong p-values, and were
supported by a similarity plot (Fig. 4). Indeed, there was an abrupt change
in the topological position of Longquan-140 and HKU3 downstream of
the first breakpoint (position 1727) and upstream of the second
breakpoint (position 3055). Accordingly, the Longquan-140 genome
can be divided into three regions with different evolutionary histories:
regions A (nucleotides 1-1726), B (1727-3055) and C (3056 to the end
of the genome sequence). In regions A and C, Longquan-140 was most
closely related to HKU3 with 96.0% nucleotide similarity, while in
region B it occupied a basal position, suggesting that the recombination
event involved a currently unsampled parental virus. No other
putative recombination events received significant statistical support.

Fig. 3. Genome organization of coronaviruses. The four CoVs discovered in this study are shown in bold.

Fig. 4. Recombination within the genome of the Longquan-140 virus. A sequence similarity plot (A) reveals two recombination breakpoints shown by black arrows with their locations. The plot shows genome-scale similarity comparisons of Longquan-140 (query) against HKU3 and other selected SARS-related CoVs. Phylogenies of regions A, B and C are shown below the similarity plot. Numbers (> 70%) above or below branches indicate percentage bootstrap values.


2.5. Newly-identified alphacoronaviruses in bats 


All remaining bat viruses discovered here belong to the genus
Alphacoronavirus, and had genome organizations similar to those of
known alphacoronaviruses (Fig. 3). In the RdRp phylogeny these
viruses fell into seven distinct lineages within three clusters (Fig. 5).
The first cluster included two newly-identified lineages, comprising
the virus (Neixiang-31) identified in one Myotis davidii bat from
Neixiang (Henan) and those identified in Hipposideros armiger, M.
davidii, M. siligorensis, R. macrotis, R. pearsonii and R. pusillus bats
from Anlong (Guizhou). Further comparison of the CoV replicase domains
(ADRP, 3CLpro, RdRp, Hel, ExoN, NendoU and O-MT) revealed that
these newly-identified bat viruses, as well as the bat virus
BtMr-AlphaCoV/SAX2011 (NC_028811.1), exhibited more than 10% amino
acid difference in all seven replicase domains (Table S3). These viruses
also formed a distinct cluster in all gene trees, and clustered
according to their geographic origins (Fig. 5 and S2). Together,
these data suggest that these viruses satisfy the species demarcation
criterion defined by ICTV and should therefore be considered a novel
member of the genus Alphacoronavirus, which we denote as Neixiang
Md bat coronavirus (NMBV).
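
The demarcation logic used above (more than 90% amino acid identity in the conserved replicase domains places two viruses in the same species, so more than 10% difference argues for a distinct one) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the gap handling and the example identity values are hypothetical and are not taken from Table S3.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def same_species(domain_identities, threshold=90.0):
    """True only if every conserved domain exceeds the identity threshold."""
    return all(value > threshold for value in domain_identities.values())


# Toy aligned fragments (hypothetical, for illustration only).
identity = percent_identity("MKV-LA", "MKVQLT")

# Hypothetical per-domain identities (%) against the closest known species.
candidate = {
    "ADRP": 71.0, "3CLpro": 85.2, "RdRp": 89.5, "Hel": 88.9,
    "ExoN": 87.3, "NendoU": 80.4, "O-MT": 86.0,
}
print(identity, same_species(candidate))
```

Requiring all seven domains to pass, rather than an average, mirrors the way the comparison is described in the text, where each domain is checked separately.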

Within the second cluster, the viruses discovered in bats from both
Anlong (Guizhou province) and Neixiang (Henan province) formed a
distinct lineage that was closely related to Scotophilus bat coronavirus
512 (Sc-BatCoV 512) (Tang et al., 2006). As these viruses exhibited
more than 10% amino acid difference from members of the genus
Alphacoronavirus in the conserved replicase domains (with the
exception of O-MT relative to Sc-BatCoV 512 (9.3%; Table S2)), it is
possible that Anlong-190 and Neixiang-32 represent a novel CoV species,
which we denote as Anlong Ms bat coronavirus. However, further studies
are clearly needed to determine whether this virus indeed represents a
novel coronavirus. The remaining viruses, which were discovered in
bats sampled from Lushi and Neixiang and designated as Lushi Ml bat
coronaviruses, formed another distinct lineage that showed a close
evolutionary relationship with the bat virus JTAC2 (KU182966).
Remarkably, these viruses were most closely related to porcine
epidemic diarrhea virus (PEDV) in all five gene trees (Fig. 5 and S2)
and exhibited >10% amino acid difference from known members of the
genus Alphacoronavirus in six conserved replicase domains
(Table S3). However, the O-MT gene of Lushi Ml bat coronaviruses was
most closely related to that of PEDV, with 7.3% amino acid difference,
indicating that it is a novel variant of PEDV (designated as Lushi Ml bat
coronavirus). Clearly, these data support the evolutionary origin of
PEDV from bats (Huang et al., 2013).

Fig. 5. Phylogenetic analyses of the amino acid sequences of the RdRp including those CoVs obtained here. Numbers (> 70%) above or below branches indicate percentage bootstrap values. The trees were mid-point rooted for clarity only. The scale bar represents the number of amino acid substitutions per site.

Within the third cluster, the newly-identified bat viruses fell into 
three lineages. The newly-identified virus (Neixiang-64) in Miniopterus 
schreibersii from Neixiang formed a lineage with viruses in M. 
fuliginosus (KJ473799.1), and showed a close relationship with the 
HKU8 virus (Chu et al., 2008). Notably, the viruses from Anlong were 
grouped into two lineages, one of which contained Miniopterus bat 
coronavirus 1A AFCD62 (Chu et al., 2008). In sum, our data reveal a
high genetic diversity of alphacoronaviruses from bats in China. 


2.6. Phylogenetic relationships between newly-identified viruses and 
their bat hosts 


Comparison of the tree topologies of SARSr-Rh-BatCoVs and their
bat hosts revealed strong incongruence between the SARSr-Rh-BatCoVs
and their Rhinolophus bat hosts at the species level, with
only two likely co-divergence events (Fig. 6A). Similarly, our
co-phylogenetic analysis of alphacoronaviruses and their bat hosts
provided evidence for 6-7 co-divergence events, 11-12 host-switching
events, 0 lineage duplications, 1-2 losses and 0 failure-to-diverge
events (Fig. 6B). Overall, our co-phylogenetic analysis did not identify
significant congruence between the phylogenetic trees of viruses and
their bat hosts (P=0.04).

Fig. 6. Co-phylogenetic analyses of bat hosts and their associated coronaviruses. (A) SARS-related coronaviruses from bats, (B) alphacoronaviruses from bats. The coronavirus tree is shown in blue while the host phylogeny is shown in black. The host tree was based on mitochondrial cytochrome b gene sequences, and the coronavirus trees were based on the RdRp gene. Filled circles at the nodes indicate co-divergence events, empty circles indicate lineage duplication events, arrows indicate host-switching events, while dotted lines indicate loss events.

Clearly, more SARSr-Rh-BatCoVs were identified in R. sinicus and 
R. ferrumequinum bats sampled from a variety of locations (Fig. 2), as 
well as in R. monoceros sampled from Zhejiang province. In addition, 
the same virus could be identified in several species from the same 
geographic locality. For example, SARSr-Rh-BatCoVs were identified in 
four species of Rhinolophus bats from Longquan (Zhejiang) and in two 
bat species from Anlong (Guizhou). Similarly, Neixiang Md bat
coronavirus (alphacoronavirus) was identified in six bat species
sampled in Anlong (Guizhou). Hence, it is possible that high contact 
rates among some animal species may enable the acquisition and 
spread of CoVs. 


3. Discussion 


Since the discovery of SARSr-Rh-BatCoVs in bats in 2005 (Lau 
et al., 2005; Li et al., 2005), intense effort has focused on characterizing 
additional CoVs in bats, leading to the identification of diverse SARS- 
like viruses and other bat CoVs worldwide (Drexler et al., 2014; Hu
et al., 2015; Huang et al., 2016; Woo et al., 2006). More importantly,
recent studies provide further evidence that human CoVs including
MERS and SARS viruses may have their ultimate ancestry in bats
(Annan et al., 2013; Corman et al., 2014, 2015; Ge et al., 2013; Huynh
et al., 2012; Hu et al., 2015; Ithete et al., 2013; Tao et al., 2017). These
results highlight the potential significance of detecting and
characterizing CoVs in bats before their emergence in humans. Herein we
describe diverse SARSr-Rh-BatCoVs and alphacoronaviruses (includ- 
ing novel species) in 13 species of bats sampled from three provinces in 
China, with an overall prevalence of 6.84%. As such, these data reveal 
both the high genetic diversity and the wide geographic distribution of 
CoVs in diverse bats in China. 

Among the known SARSr-Rh-BatCoVs, those discovered by Yang 
et al. (2015) are most closely related to human SARS CoVs in the S 
gene ( > 90% amino acid similarities), but distant in the ORF8 gene (< 
43% amino acid similarities). Interestingly, other CoVs have been 
described that are closely related to SARS-CoVs in the ORF8 gene, but 
relatively distant in the S, nsp2, nsp11, and ORF3 genes (Table S2) 
(Lau et al., 2015; Wu et al., 2016). In this study, with the exception of 
the S and ORF3 genes, the viruses discovered in bats sampled in 
Guizhou province were most closely related to SARS-CoVs including 
the ORF8 gene. That all these SARSr-Rh-BatCoVs were sampled in 
southwestern China provides evidence for the origin of SARS-CoVs 
from bats in this region. 

To date, SARSr-Rh-BatCoVs have been detected in bats sampled 
from diverse geographic regions in China as well as in Europe (Drexler 
et al., 2010). Notably, the majority of SARSr-Rh-BatCoVs have been 
discovered in Chinese Rhinolophus bats, while only one study reported 
the detection of SARS-related CoVs in Chaerephon plicata bats 
sampled from Yunnan province (Yang et al., 2013). In our study a 
total of 1067 bats from 21 bat species were captured, with SARSr-Rh- 
BatCoVs identified in 5 species of Rhinolophus bats. Although previous 
studies indicated that R. sinicus and R. ferrumequinum bats exhibited 
a higher prevalence of SARS-related viruses (Lau et al., 2005; Li et al., 
2005; Tang et al., 2006), more SARSr-Rh-BatCoVs were identified in R. 
monoceros bats (Table S1). In sum, these data suggest that SARSr-Rh- 
BatCoVs may be specifically associated with Rhinolophus bats. 

It is commonly stated that all the alphacoronaviruses originate from 
bat species (Woo et al., 2012), as may also be the case for the human 
CoVs NL63 and 229E (Corman et al., 2015; Tao et al., 2017). In
addition, it is believed that PEDV might have originated from bats,
although direct evidence is lacking (Huang et al., 2013). In this study,
Lushi Ml bat coronaviruses discovered in bats (Murina leucogaster and
Miniopterus schreibersii) from Lushi and Neixiang exhibited a close
relationship to PEDV, especially in the O-MT domain (92.7% amino acid
identity). In addition, all other viruses within the cluster were
sampled from bats. Hence, these data provide more compelling evidence
that PEDV may have originated from bats.

The evolutionary history of RNA viruses is characterized by both 
host switching and co-divergence (Li et al., 2015; Shi et al., 2016), 
which also appears to be true of coronaviruses (Annan et al., 2013; Cui 
et al., 2007; Drexler et al., 2014; Lau et al., 2010, 2012, 2013; Tang 
et al., 2006; Vijaykrishna et al., 2007; Wertheim et al., 2013; Woo et al., 
2009, 2012). Similar to previous studies (Drexler et al., 2010; Ge
et al., 2013; Lau et al., 2005; Li et al., 2005; Wu et al., 2016; Yang et al.,
2013), SARSr-Rh-BatCoVs were mainly identified in Rhinolophus bats,
but not in other bats even when sampled in the same locality (e.g. in
Guizhou and Henan). In contrast, alphacoronaviruses were mainly
identified in non-Rhinolophus bats and no alphacoronaviruses were
discovered in the colony that only comprised Rhinolophus bats. Finally,
it was noteworthy that SARSr-Rh-BatCoVs and Neixiang Md bat
coronavirus (alphacoronavirus) were each identified in several bat
species (from up to three families) in the localities of Anlong and
Longquan, indicative of local inter-species transmission. Hence, these
data suggest that high contact rates among specific animal species
enable the acquisition and spread of CoVs.


4. Material and methods 
4.1. Bat trapping and specimen collection 


Bats were captured alive with mist nets or harp traps at natural
roosts in caves in Guizhou, Henan, and Zhejiang provinces during
2012-2015 (Fig. 1). Bat species were initially identified by
morphological examination and further confirmed by sequence analysis of
the mt-cyt b gene (Guo et al., 2013). All bats were anesthetized before
surgery, with every effort made to minimize suffering. Tissue samples,
including those from the rectum, were collected from bats for
coronavirus detection.


4.2. DNA and RNA extraction, PCR and sequencing 


We extracted total DNA from bat tissue samples using the DNeasy 
Blood & Tissue kit (QIAGEN, Valencia, USA) according to protocols 
suggested by the manufacturer. Total RNA was extracted from fecal or 
tissue samples using TRIzol (Invitrogen, Carlsbad, CA) according to the 
manufacturer's instructions. Coronavirus RNA was detected by RT- 
PCR as described previously (Lau et al., 2005; Wang et al., 2015). 
Other coronavirus gene sequences were amplified using primers
designed from the conserved regions of known genome sequences.

RT-PCR amplicons were purified using the QIAquick Gel Extraction
kit (QIAGEN, Valencia, USA) according to the manufacturer's
recommendations. Purified DNA < 700 bp was sequenced directly,
while purified DNA > 700 bp was cloned into the pMD18-T
vector (TaKaRa, Dalian, China), which was subsequently transformed
into JM109-143 competent cells. DNA sequencing was performed with
Applied Biosystems 377 gene sequencers.


4.3. Complete genome sequencing 


Four representative bat CoVs were selected for full-genome
sequencing. The initial PCR primer sets were designed from
each pan-CoV amplicon sequence and/or from a conserved region in
the CoV RdRp. As required, walking primers were designed for further
PCR and sequencing. All primer sets used in this study are available
upon request.

All sequences generated in this study have been deposited in
GenBank and assigned accession numbers KF294268-KF294282,
KF294373-KF294378, KF294381-KF294383, KF294420-KF294457,
KY770850-KY770860.


4.4. Phylogenetic analysis of CoV sequences 


In addition to the sequences recovered here, reference sequences 
that cover the phylogenetic diversity of CoVs were compiled for 
evolutionary analyses. Sequence alignments were
performed using the MAFFT algorithm (Katoh and Standley, 2013). After
alignment, gaps and ambiguously aligned regions were removed using
Gblocks (v0.91b) (Talavera and Castresana, 2007). Phylogenetic trees
were estimated using the maximum likelihood (ML) method
implemented in PhyML v3.0 (Guindon et al., 2010) with bootstrap support
values calculated from 1000 replicate trees. The best-fit amino acid
substitution models were determined using MEGA version 5 (Tamura 
et al., 2011). 
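
To make the bootstrap step concrete, the sketch below shows the resampling idea in plain Python: alignment columns are drawn with replacement to build one replicate data set, and the support for a clade is the percentage of replicate trees that contain it. It is a conceptual illustration only; PhyML performs both the resampling and the tree inference internally, and the sequence names, toy alignment and replicate clade sets here are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def bootstrap_support(clade, replicate_clades):
    """Percentage of replicate trees that contain the given clade."""
    if not replicate_clades:
        raise ValueError("need at least one replicate")
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


# Toy amino acid alignment (hypothetical sequences).
toy = {"virus_A": "MKVLAT", "virus_B": "MKVQAT", "virus_C": "MRVQGT"}
replicate = bootstrap_columns(toy, random.Random(0))

# Clades recovered in four hypothetical replicate trees.
ab = frozenset({"virus_A", "virus_B"})
replicates = [{ab}, {ab}, set(), {ab}]
support = bootstrap_support(ab, replicates)
print(replicate, support)
```

In the figures, only nodes whose support exceeds 70% are labelled, which corresponds to filtering on the value returned by `bootstrap_support`.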


4.5. Recombination analysis 


Full-length genomic sequences of the SARSr-Rh-BatCoVs Longquan-140,
Anlong-103 and Jiyuan-84 were aligned with those of other bat
SARSr-Rh-BatCoVs and other betacoronaviruses using MEGA 5.0. The
aligned sequences were preliminarily scanned for recombination events
using the Recombination Detection Program 4.0 (RDP4), employing
the RDP, GENECONV, and Bootscan methods (with default
parameters) (Martin et al., 2015). The potential recombination events
suggested by RDP (i.e. those with strong P values) were investigated 
further by similarity plots implemented in Simplot 3.5.1 (Lole et al., 
1999). For each of the putative recombinant regions, phylogenies were 
estimated using the maximum likelihood method available in PhyML 
v3.0 (Guindon et al., 2010). 
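
The similarity-plot step can be illustrated with a simple sliding-window calculation. The Python sketch below computes raw percent identity between an aligned query and reference in overlapping windows; an abrupt drop between neighbouring windows is the kind of signal that marks a candidate breakpoint. This is an assumption-laden simplification rather than the Simplot implementation: the window and step sizes are illustrative defaults, not settings reported in this study, and Simplot can apply distance corrections that this sketch omits.

```python
def window_similarity(query, reference, window=200, step=20):
    """Percent identity of query vs. reference in sliding windows.

    Returns a list of (window midpoint, percent identity) tuples.
    Columns with a gap in either sequence are ignored.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    points = []
    for start in range(0, len(query) - window + 1, step):
        compared = 0
        matches = 0
        for a, b in zip(query[start:start + window], reference[start:start + window]):
            if a == "-" or b == "-":
                continue
            compared += 1
            if a == b:
                matches += 1
        if compared:
            points.append((start + window // 2, 100.0 * matches / compared))
    return points


# Toy example: the second half of the query diverges completely.
toy_query = "A" * 10 + "C" * 10
toy_reference = "A" * 20
profile = window_similarity(toy_query, toy_reference, window=10, step=10)
print(profile)
```

Plotting such a profile for the query against each candidate parent gives the familiar similarity plot, with crossovers between curves suggesting where the closest relative changes along the genome.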


4.6. Analysis of CoV and host co-phylogenies 


Co-phylogenetic analyses of bat hosts and their associated
coronaviruses were conducted using the heuristic event-based method
available in the Jane software package (Conow et al., 2010). We
reconstructed patterns of co-divergence (and hence cross-species
transmission) using a weight of 0 for co-divergence and a weight of 1 for
duplication, host switching, lineage loss and failure to diverge. We used
Jane with 100 generations and a population size of 100 as parameters
for the genetic algorithm. To test the probability of observing the
inferred co-divergence number by chance, we employed the random tip
mapping method with the generation number =100, population size
=100, and sample size =50.
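
The event weights described above translate into a simple cost function: each reconstruction is scored by summing the weight of every inferred event, and the observed cost is then compared with costs obtained after randomizing the tip associations. The Python sketch below illustrates that arithmetic only. The event counts are one illustrative combination within the ranges reported in the Results, the randomized costs are made-up placeholders, and the p-value convention (fraction of randomized costs no greater than the observed cost) is an assumption about how such tests are typically summarized, not output from Jane.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EventCounts:
    codivergence: int
    duplication: int
    host_switch: int
    loss: int
    failure_to_diverge: int


# Weights stated in the text: 0 for co-divergence, 1 for every other event.
COSTS = {
    "codivergence": 0,
    "duplication": 1,
    "host_switch": 1,
    "loss": 1,
    "failure_to_diverge": 1,
}


def total_cost(events, costs=COSTS):
    """Sum of event counts weighted by their costs."""
    counts = asdict(events)
    return sum(costs[name] * counts[name] for name in counts)


def randomization_p_value(observed_cost, random_costs):
    """Fraction of randomized solutions that are at least as cheap as observed."""
    if not random_costs:
        raise ValueError("need at least one randomized cost")
    as_good = sum(1 for c in random_costs if c <= observed_cost)
    return as_good / len(random_costs)


# One illustrative combination within the reported ranges (hypothetical pairing).
example = EventCounts(codivergence=7, duplication=0, host_switch=11,
                      loss=1, failure_to_diverge=0)
cost = total_cost(example)

# Placeholder randomized costs, for illustration only.
p_value = randomization_p_value(cost, [10, 12, 15, 20])
print(cost, p_value)
```

Because co-divergence carries zero weight, a low total cost corresponds to a reconstruction dominated by co-divergence, so many host switches push the cost up and weaken the evidence for congruence.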